Candida heat shock protein, cDNA and uses thereof

The invention concerns the cDNA of a heat shock protein
isolated from C. albicans, the corresponding protein, and
fragments thereof, for developing methods to identify
C. albicans in biological and/or environmental samples,
and/or preparations for therapeutic, prophylactic or
vaccine purposes.

Pathogenic yeasts are the major agents of
opportunistic infections in immunosuppressed patients, in
particular AIDS, tumor and neutropenic patients or bone
marrow transplanted subjects (1). The susceptibility of
HIV+ subjects to C. albicans is related to the strong
decrease of cell-mediated immunity caused by the
numerical and functional decrease of CD4+ helper-inducer
lymphocytes (2).

C. albicans cell wall mannoproteins, as well as heat
shock proteins (HSPs) of other microorganisms, are major
antigens and immunomodulators, and play a relevant role
during host invasion and infection (3,4).

By using a rabbit immune serum raised against
heat-inactivated cells of C. albicans strain ATCC 20955,
the authors of the instant invention isolated the caRLV130
clone from an expression library in the λgt11 phage
constructed from cDNA isolated from C. albicans at the
yeast growth stage. Said clone contains a DNA insert of
2325 base pairs which codes, in the 5'-3' direction from
+105 to +2072, for a 656 amino acid protein having a strong
homology with a S. cerevisiae heat shock protein 70.
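The coordinates given above can be checked with simple arithmetic. The following is a minimal sketch, not part of the original disclosure: it assumes 1-based inclusive coordinates and the standard genetic code (C. albicans is known to decode CUG as serine rather than leucine, which this sketch ignores; the codons translated here contain no CTG). The function names `orf_length` and `translate` are illustrative only.

```python
# Standard genetic code, codons enumerated in TCAG order (assumption: standard table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def orf_length(start, end):
    """Length in nucleotides of a region given 1-based inclusive coordinates."""
    return end - start + 1


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for pos in range(0, usable, 3):
        residue = CODON_TABLE[dna[pos:pos + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Coding region from +105 to +2072 of the 2325 bp insert.
nucleotides = orf_length(105, 2072)
codons = nucleotides // 3
print(nucleotides, codons)

# First 48 nucleotides of the coding sequence as listed in SEQ ID NO: 1.
first_codons = "ATGTCTAAAGCTGTTGGTATTGATTTAGGTACAACCTATTCTTGTGTT"
print(translate(first_codons))
```

Under these assumptions the region spans 1968 nucleotides, i.e. 656 codons, which agrees with the stated protein length; adding the stop codon gives the 1971 bp open reading frame mentioned for Figure 1.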

HSPs are induced by different stresses, either
chemical or physical, normally by heating. Many HSPs are
also present and active in non-stressed cells, where they
perform important functions in cell physiology
("chaperonins"). They may be grouped in families of
different molecular weights, highly conserved even among
phylogenetically distant organisms (5). It should
therefore not be surprising that HSPs are involved in the
immune response, that they represent major antigens of
different pathogenic agents, or that they may elicit
autoimmune responses, given that the infection itself
represents an extreme form of stress, both for the
infectious agent and for the host (4).

It is therefore an object of the invention a nucleic
acid comprising a nucleotide sequence coding for the
protein having the amino acid sequence of SEQ ID No. 2 or
parts thereof. Preferably the nucleic acid comprises a
nucleotide sequence with at least 65% homology with the
nucleotide sequence of SEQ ID No. 1 or parts thereof.
More preferably the nucleic acid comprises the nucleotide
sequence of SEQ ID No. 1 or parts thereof.

A further object of the invention is a composition
comprising a nucleic acid comprising a nucleotide
sequence coding for the protein having the amino acid
sequence of SEQ ID No. 1 or parts thereof. Preferably the
composition comprises a nucleic acid having a nucleotide
sequence with at least 65% homology with the nucleotide
sequence of SEQ ID No. 1 or parts thereof. More preferably
the composition comprises a nucleic acid having the
nucleotide sequence of SEQ ID No. 1 or parts thereof.

A further object of the invention is the use of the
nucleic acid comprising a nucleotide sequence coding for
the protein having the amino acid sequence of SEQ ID No. 2
or parts thereof for oligonucleotide probes to be used in
diagnosis and typing of Candida related pathologies. The
use of a nucleic acid having at least 65% homology with
the nucleotide sequence of SEQ ID No. 1 or parts thereof
is preferred. The use of a nucleic acid having the
nucleotide sequence of SEQ ID No. 1 or parts thereof is
most preferred.

The oligonucleotides of the invention are
advantageously used: for PCR (polymerase chain reaction),
to detect the presence in biological and/or environmental
samples either of C. albicans or of other Candida species
or of related yeast-like microorganisms comprising said
gene; in a labeled form (radionuclides, biotin, enzymes,
etc.), to detect the presence in biological and/or
environmental samples either of C. albicans or of other
related species; for the typing and/or diagnosis of
C. albicans or related species; as potential antibiotic
and/or chemotherapeutic targets, or as antisense RNA
active against Candida species and/or related yeast-like
microorganisms coding for a homologous sequence.

Another object of the invention is a polypeptide
having the amino acid sequence comprised in SEQ ID No. 2,
or having at least 50% homology with SEQ ID No. 2, or
fragments and/or functional and immunological homologues
thereof.

A further object of the invention is a composition
comprising a polypeptide having an amino acid sequence
comprised in SEQ ID No. 2 or having at least 50% homology
with SEQ ID No. 2, or fragments and/or functional and
immunological homologues thereof.

A further object of the invention is the use of a
polypeptide having the amino acid sequence comprised in
SEQ ID No. 2 or having at least 50% homology with SEQ ID
No. 2, or of fragments and/or functional and/or
immunological homologues thereof, to make polyclonal or
monoclonal antibodies against the 70 kDa heat shock
protein (HSP70) of C. albicans or related species.

A further object of the invention is the use of a
polypeptide having the amino acid sequence comprised in
SEQ ID No. 2 or having at least 50% homology with SEQ ID
No. 2, or of fragments and/or functional and/or
immunological homologues thereof, to detect the HSP70 of
C. albicans and related species in a biological sample of
human, animal or environmental origin.

A further object of the invention is the use of a
polypeptide having the amino acid sequence comprised in
SEQ ID No. 2 or having at least 50% homology with SEQ ID
No. 2, or of fragments and/or functional and/or
immunological homologues thereof, for the preparation of a
composition to be used for prophylaxis and/or therapy of
diseases caused by C. albicans or related microorganisms
(pathogenic yeasts).

A further object of the invention is the use of a
polypeptide having the amino acid sequence comprised in
SEQ ID No. 2 or having at least 50% homology with SEQ ID
No. 2, or of fragments thereof, as potential antibiotic
and/or chemotherapeutic targets active against Candida
species and/or related yeast-like microorganisms coding
for a homologous sequence.

The invention will be described in different
embodiments for illustrative but non-limiting purposes.

Figure 1 represents the 1971 base pair DNA sequence
(lower-case letters) corresponding to the open reading
frame of the λgt11 (caRLV130) clone insert and the deduced
amino acid sequence (capital letters, one-letter code).

Figure 2 represents the nucleotide sequence of the
coding insert of the caRLV130 clone (lower-case letters)
and its comparison with the S. cerevisiae YSCSSA1 gene
(capital letters).

Figure 3 represents the 656 amino acid sequence
deduced from the coding insert of the caRLV130 clone
(lower-case letters) and its comparison with the
S. cerevisiae YSCSSA1 gene (capital letters). The amino
acid code utilized is the one-letter code.

WO 96/36707 PCT/IT96/00097

Figure 4 represents, in panel A, a Southern blot
analysis of C. albicans strain ATCC 20955 chromosomes
separated by pulsed-field electrophoresis (TAFE). The
caRLV130 probe labels the highest molecular weight
chromosome (3.5 Mbp). Panel B: electrophoretic
separation of C. albicans strain ATCC 20955 chromosomes.

Figure 5 represents, on the left side, a Northern blot
analysis: total RNA extracted from C. albicans cells
grown at 22°C and transferred to 37°C for the times
indicated was hybridized with radiolabeled caRLV130
(cahsp70) and actin probes. The actin probe hybridization
was performed to control the RNA amount on filters (see
ref. 8). On the right side: immunoblotting reactivity of
anti-CAHSP70 mouse serum with C. albicans extracts, at
different times after inducing a heat shock response as
previously described.

Figure 6 represents, in panel A, an SDS-PAGE analysis:
a) expression products of E. coli M15 containing the
pDS56/RBSII-E⁻-6his caRLV130/1 plasmid; b) expression
products of E. coli M15 containing the pDS56/RBSII-E⁻-6his
caRLV130/2 plasmid; c) expression products of E. coli M15
containing the pDS56/RBSII-E⁻-6his caRLV130/3 plasmid; d)
expression products of E. coli M15 containing the
pDS56/RBSII-E⁻-6his caRLV130/4 plasmid. N.I.: non-induced
E. coli culture extracts. I.: 1 mM IPTG-induced E. coli
culture extracts. P.: fraction purified on a histidine
affinity nickel column from 1 mM IPTG-induced E. coli
culture extracts. Panel B: schematic representation of
the caRLV130 coding sequence portions cloned into the
recombinant plasmids used in panel A. Right side:
molecular weight in kDa. Left side: denomination of the
expression product of each recombinant plasmid. For
further details, see table I.

Figure 7 represents the reactivity, after
immunoblotting on nitrocellulose filters, of mouse sera
raised against the CAHSP70 fragments indicated in the
figure: a) nickel column purified expression products of
the pDS56/RBSII-E⁻-6his caRLV130/1 plasmid in 1 mM
IPTG-induced E. coli; b) nickel column purified
expression products of the pDS56/RBSII-E⁻-6his caRLV130/2
plasmid in 1 mM IPTG-induced M15 E. coli; c) nickel column
purified expression products of the pDS56/RBSII-E⁻-6his
caRLV130/3 plasmid in 1 mM IPTG-induced M15 E. coli; d)
nickel column purified expression products of the
pDS56/RBSII-E⁻-6his caRLV130/4 plasmid in 1 mM
IPTG-induced M15 E. coli (see also Fig. 6 and table I for
a definition of polypeptide fragments). Left side:
molecular weight of purified fragments.

Figure 8 represents the reactivity, after
immunoblotting on nitrocellulose filters, of healthy
human sera with CAHSP70 and fragments thereof: a) nickel
column purified expression products of the
pDS56/RBSII-E⁻-6his caRLV130/1 plasmid in 1 mM
IPTG-induced M15 E. coli; b) nickel column purified
expression products of the pDS56/RBSII-E⁻-6his caRLV130/2
plasmid in 1 mM IPTG-induced M15 E. coli; c) nickel column
purified expression products of the pDS56/RBSII-E⁻-6his
caRLV130/3 plasmid in 1 mM IPTG-induced M15 E. coli; d)
nickel column purified expression products of the
pDS56/RBSII-E⁻-6his caRLV130/4 plasmid in 1 mM
IPTG-induced M15 E. coli. Left side: molecular weight of
purified fragments. Right side: denomination of purified
protein fragments. For further details see also table I.

Figure 9 represents, in panel A, a PCR experiment
performed using the oligonucleotide combination CA2-CA3 in
the presence of C. albicans (1), C. parapsilosis (2),
C. glabrata (3), C. guilliermondii (4), C. krusei (5),
C. tropicalis (6), Mus muris (7), E. coli (8) and
S. cerevisiae (9) DNAs. The control with no DNA is (10).
The molecular weight of the amplified fragment is
indicated on the right side. Panel B: PCR experiment
using the combination of CA1-CA4 oligonucleotides in the
presence of C. albicans cDNA. DNA amplified from
C. albicans DNA: 10 ng (2); 1 ng (3); 100 pg (4); 10 pg
(5); 1 pg (6). Control: reaction with no DNA (1). PCR
reaction conditions are as follows: denaturation, 90 sec.
at 94°C; annealing, 90 sec. at 60°C; extension, 120 sec.
at 72°C; 25 cycles.
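The cycling conditions stated for Figure 9 can be written down as a small data structure. The sketch below is illustrative only: the `Step` class and helper names are hypothetical, ramp times between temperatures are ignored, and the fold amplification is the theoretical upper bound assuming perfect doubling at every cycle, which real reactions do not reach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# One cycle as stated in the Figure 9 legend.
CYCLE = (
    Step("denaturation", 94, 90),
    Step("annealing", 60, 90),
    Step("extension", 72, 120),
)
CYCLES = 25


def total_seconds(cycle, n_cycles):
    """Total hold time of the program, ignoring ramping between steps."""
    return n_cycles * sum(step.seconds for step in cycle)


def max_fold_amplification(n_cycles):
    """Theoretical upper bound: the target doubles at every cycle."""
    return 2 ** n_cycles


print(total_seconds(CYCLE, CYCLES) / 60, "minutes of hold time")
print(max_fold_amplification(CYCLES), "fold (theoretical maximum)")
```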

Figure 1 shows the 1971 bp coding region of the
isolated gene.

The caRLV130 sequence was deposited in the EMBL
database (No. Z30210). No intron can be found in the
genomic sequence, as shown by PCR product analysis and by
Southern blot. By comparing the caRLV130 insert sequence
with the sequences present in version 6.7 of the
"GENE BANK" database, some homologies can be detected.
The insert shows the highest homology with the
S. cerevisiae gene SSA1 (one of the nine genes of the
yeast heat shock gene family). The overall nucleotide
sequence homology is 78.8% in the coding region (Figs. 2
and 3).
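The text does not state how the homology percentage was computed. As a minimal sketch, assuming that "homology" here means percent identity over an existing pairwise alignment, the calculation could look as follows. The function name `percent_identity`, the gap convention, and the second toy sequence are assumptions for illustration, not data from the disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are skipped, so the result is
    the fraction of identical residues among the compared columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example: the first string is the start of SEQ ID NO: 1, the second is
# a hypothetical sequence differing at one position.
print(percent_identity("ATGTCTAAAG", "ATGTCAAAAG"))
```

Other conventions (counting gaps as mismatches, or normalizing by the shorter sequence) give different figures, so a threshold such as 65% is only meaningful once the convention is fixed.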

The gene corresponding to the caRLV130 sequence was
mapped on the C. albicans chromosome having the highest
molecular weight (3.5 Mbp) by pulsed-field
electrophoresis (transverse-alternating: TAFE), using the
labeled caRLV130 cDNA insert as hybridization probe on
C. albicans chromosomes blotted on nitrocellulose filters
(Fig. 4A and 4B). Gene transcription is activated by
exposing cells to a temperature higher than room
temperature (thermal shift from 22°C to 37°C). This
finding was demonstrated by hybridization experiments
using C. albicans total RNA (from cells grown either at
22°C or at 37°C, fractionated according to molecular
weight on formaldehyde agarose gel and blotted on
nitrocellulose filters) and the caRLV130 DNA insert as
radioactive probe. The induction of transcription is also
coupled to an increase of protein expression 2, 6 and 24
hours after the 22°C to 37°C temperature shift (see
Fig. 5).

Different portions of the caRLV130 insert sequence
were cloned in the expression plasmid pDS56/RBSII-E⁻-6his
(6), and the encoded polypeptides were expressed in
E. coli after fusion of their amino terminal sequence with
6 histidine residues. The histidine stretch allowed a
rapid and efficient purification on nickel columns of the
polypeptides derived from the caRLV130 insert sequence
(see Fig. 6 and table I for denomination and length of
polypeptide fragments).

After purification, the recombinant peptides were used
as immunogens to produce mouse immune sera, and are
therefore also suitable for inducing monoclonal
antibodies. Following the immunization schedule shown in
table II, the polypeptides, as well as the whole purified
protein, induce specific antibodies in Balb/c mice
weighing 18-22 g.

Table II

Immunization schedule of 18-22 g weight Balb/c mice with
CAHSP70 peptides purified as described in the text and in
Fig. 6.

Immunogen    Immunization   First boost   Second boost   Serum titer
             (day 1)        (day 21)      (day 41)       (day 51)

CAHSP70      5 µg           5 µg          10 µg          > 12.800
CAHSP70/2    5 µg           5 µg          10 µg          > 12.800
CAHSP70/3    5 µg           5 µg          10 µg          > 12.800
CAHSP70/4    5 µg           5 µg          10 µg          > 12.800

The indicated immunogen amount was inoculated
intraperitoneally in a 200 µl volume.

The titer was determined by indirect ELISA with the
antigen used for coating at a 200 ng/well concentration,
in a final volume of 100 µl, and represents the highest
serum dilution able to give an ELISA positive reaction
(optical density at 405 nm > two fold the no antigen
control value).
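The titer rule just described (highest dilution whose optical density exceeds twice the no-antigen control) can be sketched as below. The function name, the dilution series and all optical density values are hypothetical and serve only to illustrate the rule; they are not measurements from this work.

```python
def elisa_titer(readings, control_od, fold=2.0):
    """Return the highest reciprocal dilution giving a positive reaction.

    readings: mapping of reciprocal serum dilution -> OD at 405 nm.
    A well is positive when its OD exceeds `fold` times the no-antigen
    control OD. Returns None if no dilution is positive.
    """
    cutoff = fold * control_od
    positive = [dilution for dilution, od in readings.items() if od > cutoff]
    return max(positive) if positive else None


# Hypothetical two-fold dilution series and OD405 readings.
readings = {
    100: 1.80, 200: 1.52, 400: 1.21, 800: 0.95, 1600: 0.70,
    3200: 0.48, 6400: 0.30, 12800: 0.17, 25600: 0.09,
}
control_od = 0.05  # hypothetical no-antigen control

print(elisa_titer(readings, control_od))
```

With these invented readings the cutoff is 0.10, so the last positive well is the 1:12800 dilution.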

Serum titers for each antigen were found to be >
12.800 by immunoenzyme test (indirect ELISA) with the
adsorbed antigen at 200 ng/well, in a final volume of 100
µl. The specificity of the immunoenzyme test results was
confirmed in immunoblot experiments on nitrocellulose
filters, as shown in Fig. 7.

The same polypeptides were used as immunogens in
proliferation assays on human peripheral blood
lymphocytes, evaluating the 3H-thymidine uptake after 7
days of culture according to standard techniques (7).
Results obtained with different donors (two examples are
shown in table III) demonstrate that CAHSP70 is able to
induce a good thymidine uptake, as well as the
proliferation of naive lymphocytes from umbilical cord
blood (Table IV), suggesting that the protein itself or
parts thereof have a mitogenic activity.

Table III

Peripheral blood lymphocyte proliferation induction
activity of CAHSP70 and fragments thereof

Inducing materials   Dose         Lymphoproliferative activity
                                  3H-thymidine uptake
                                  (cpm ± SD/2x10^4 cells)

none                 -            500 ± 200
MP-F2                50 µg/ml     13.393 ± 11.555
IL-2                 100 U/ml     28.205 ± 18.014
CAHSP70              1 µg/ml      8.730 ± 5.181
CAHSP70/2            1 µg/ml      2.900 ± 2.300
CAHSP70/3            1 µg/ml      3.600 ± 2.700
CAHSP70/4            1 µg/ml      11.685 ± 8.174
Lymphoproliferation of healthy donor peripheral blood
lymphomonocyte cultures after induction with the
CAHSP70 cloned fragments. Positive controls: C. albicans
mannoprotein antigen (MP-F2) and Interleukin-2 (IL-2).
Negative controls: no materials. Shown values represent
average values ± SD from 7 experiments with 5 different
donors. 3H-thymidine uptake was determined after 7 days
of culture. For technical details, see ref. 7.
Table IV

Umbilical cord blood cell proliferation induction
activity of CAHSP70 and fragments thereof

Inducing materials   Dose         Lymphoproliferative activity
                                  3H-thymidine uptake
                                  (cpm ± SD/2x10^4 cells)

                                  cord blood 1    cord blood 2

none                 -            2.5 ± 0.4       1.3 ± 0.3
IL-2                 100 U/ml     37.7 ± 4.5      32.8 ± 6.0
MP-F2                50 µg/ml     3.0 ± 1.4       1.5 ± 0.4
CAHSP70              1 µg/ml      12.5 ± 1.8      22.8 ± 6.6
CAHSP70/2            1 µg/ml      18.2 ± 3.0      23.1 ± 3.9
CAHSP70/3            1 µg/ml      23.8 ± 5.4      20.6 ± 9.2
CAHSP70/4            1 µg/ml      14.8 ± 3.9      17.2 ± 1.7

Proliferation of two donor umbilical cord blood cultures
after induction with the CAHSP70 cloned fragments.
Positive controls: C. albicans mannoprotein antigen
(MP-F2) and Interleukin-2 (IL-2). Negative controls: no
materials. Shown values represent average values ± SD
from 3 wells. For technical details, see table III legend
and ref. 7.



Furthermore, immunoblotting experiments revealed the
presence of anti-CAHSP70 antibodies in sera from healthy
adult humans, in particular antibodies against the
CAHSP70/4 fragment (Fig. 8), suggesting that this fragment
contains the immunodominant epitope. Taken together, the
lymphoproliferation and human serum immunoblotting data
unequivocally suggest that CAHSP70 is recognized by the
immune system during the usual Candida colonization of
healthy subjects.

Moreover, in immunoblotting on nitrocellulose
filters, anti-CAHSP70 murine sera recognize more than one
component of the HSP70 family among the proteins extracted
from heat-induced C. albicans (Fig. 5), thus showing that
the expression product of the caRLV130 insert is a
C. albicans protein which is expressed after heat shock.
According to the above results we named CAHSP70 the
C. albicans protein having the following properties: I) it
comprises the amino acid sequence coded by the caRLV130
insert; II) its gene maps on C. albicans chromosome 1
(having the highest molecular weight); III) its
expression is induced by temperature shift; IV) it
induces specific antibodies able to recognize cloned and
purified fragments (subunits); V) it induces
lymphoproliferation in lymphomonocytic cultures from
human peripheral blood. The relevant gene was named
cahsp70.

The cloning of CAHSP70, and its molecular and
biochemical characterization, allow the development of a
molecular diagnostic method based upon the amplification
of DNA corresponding to the caRLV130 insert, in addition
to immunological studies of the expression of the
C. albicans 70 kDa heat shock protein. Based on the
caRLV130 insert sequence, we synthesized oligonucleotides
which were used in polymerase chain reaction (PCR)
experiments to analyze their ability to amplify DNA
fragments homologous to C. albicans caRLV130 DNA. Two
oligonucleotides (CA2-CA4) were chosen in the regions
showing the minimal homology between the caRLV130 cDNA
sequence and known HSP70 coding gene sequences (see Fig.
2 for the alignment of the caRLV130 and YSCSSA1
sequences, Table V for the definition of minimal homology
regions and Table VI for the sequences of the
oligonucleotides used for the assay).

The combination of the CA2 (GAAATGAAAGATAAGATTGGTGCA)
and CA3 (CCACAGTAAATTACCTATTTCTTCCTC) oligonucleotides is
able to amplify DNA fragments having the expected size
and a sequence specific to C. albicans DNA (Fig. 9A),
whereas the assay sensitivity is shown in Fig. 9B by
using the CA1 (ATGTCTAAAGCTGTTGGTATTG) and CA4
(CTGCACCAATCTTATCTTTCATTTCACCATCATT) oligonucleotides.
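The four oligonucleotide sequences can be located on the cDNA by a simple string search, which also gives the expected amplicon lengths. The sketch below is illustrative: the three template fragments are copied from SEQ ID NO: 1 with the coordinates noted in the comments, primers are assumed to be written 5' to 3', and the helper names (`reverse_complement`, `find_site`, `amplicon_length`) are hypothetical. The resulting lengths are derived from the listing, not stated in the text, and only exact matches are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_site(primer, template):
    """Return ('+', index) for a sense match, ('-', index) for an
    antisense match (0-based index on the template), or None."""
    index = template.find(primer)
    if index != -1:
        return "+", index
    index = template.find(reverse_complement(primer))
    if index != -1:
        return "-", index
    return None


def amplicon_length(first_nt, last_nt):
    """Length of a product spanning 1-based inclusive positions."""
    return last_nt - first_nt + 1


CA1 = "ATGTCTAAAGCTGTTGGTATTG"
CA2 = "GAAATGAAAGATAAGATTGGTGCA"
CA3 = "CCACAGTAAATTACCTATTTCTTCCTC"
CA4 = "CTGCACCAATCTTATCTTTCATTTCACCATCATT"

# Fragments of SEQ ID NO: 1 (1-based position of the first nucleotide).
START, START_POS = "ATGTCTAAAGCTGTTGGTATTGATTTAGGT", 1
MIDDLE, MIDDLE_POS = (
    "GCTTATTCATTGAAAAACACAATCAATGATGGTGAAATGAAAGATAAG"
    "ATTGGTGCAGATGATAAAGAAAAA",
    1633,
)
TAIL, TAIL_POS = "GAAGAAGTTGATTAAATGAGGAAGAAATAGGTAATTTACTGTGG", 1957

strand1, i1 = find_site(CA1, START)    # sense primer at the start codon
strand2, i2 = find_site(CA2, MIDDLE)   # sense primer
strand3, i3 = find_site(CA3, TAIL)     # antisense primer
strand4, i4 = find_site(CA4, MIDDLE)   # antisense primer

ca2_ca3 = amplicon_length(MIDDLE_POS + i2, TAIL_POS + i3 + len(CA3) - 1)
ca1_ca4 = amplicon_length(START_POS + i1, MIDDLE_POS + i4 + len(CA4) - 1)
print("CA2-CA3 product:", ca2_ca3, "bp")
print("CA1-CA4 product:", ca1_ca4, "bp")
```

On the listed cDNA, CA1 and CA2 match the sense strand while CA3 and CA4 match the antisense strand, giving products of 335 bp (CA2-CA3) and 1690 bp (CA1-CA4) under the stated assumptions.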
SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT:
          (A) NAME: Istituto Superiore di Sanità
          (B) STREET: Viale Regina Elena 299
          (C) CITY: Rome
          (E) COUNTRY: Italy
          (F) POSTAL CODE (ZIP): 00161

          (A) NAME: Università degli Studi di Roma La Sapienza
          (B) STREET: P.le Aldo Moro 5
          (C) CITY: Rome
          (E) COUNTRY: Italy
          (F) POSTAL CODE (ZIP): 00184

    (ii) TITLE OF INVENTION: Candida heat shock protein, gene and
         uses thereof

   (iii) NUMBER OF SEQUENCES: 2

    (iv) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 2001 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1..1968


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG TCT AAA GCT GTT GGT ATT GAT TTA GGT ACA ACC TAT TCT TGT GTT  48
Met Ser Lys Ala Val Gly Ile Asp Leu Gly Thr Thr Tyr Ser Cys Val
1               5                   10                  15

GCT CAT TTT GCC AAT GAT AGA GTT GAA ATT ATT GCT AAT GAT CAA GGT  96
Ala His Phe Ala Asn Asp Arg Val Glu Ile Ile Ala Asn Asp Gln Gly
            20                  25                  30

AAT AGA ACT ACC CCT TCA TTT GTT GCC TTC ACT GAT ACT GAA AGA TTG  144
Asn Arg Thr Thr Pro Ser Phe Val Ala Phe Thr Asp Thr Glu Arg Leu
        35                  40                  45

ATT GGT GAT GCT GCC AAG AAT CAA GCT GCT ATG AAC CCA GCA AAC ACT  192
Ile Gly Asp Ala Ala Lys Asn Gln Ala Ala Met Asn Pro Ala Asn Thr
    50                  55                  60

GTT TTC GAT GCT AAA CGT TTA ATT GGG AGA AAA TTT GAT GAT CCA GAA  240
Val Phe Asp Ala Lys Arg Leu Ile Gly Arg Lys Phe Asp Asp Pro Glu
65                  70                  75                  80

GTT ATA AAT GAT GCT AAA CAT TTC CCA TTT AAA GTC ATT GAT AAA GCA  288
Val Ile Asn Asp Ala Lys His Phe Pro Phe Lys Val Ile Asp Lys Ala
                85                  90                  95

GGT AAA CCA GTG ATT CAA GTT GAA TAT AAA GGT GAA ACT AAA ACT TTT  336
Gly Lys Pro Val Ile Gln Val Glu Tyr Lys Gly Glu Thr Lys Thr Phe
            100                 105                 110

TCA CCA GAA GAA ATT TCT TCA ATG GTT TTA ACA AAA ATG AAA GAA ATT  384
Ser Pro Glu Glu Ile Ser Ser Met Val Leu Thr Lys Met Lys Glu Ile
        115                 120                 125

GCT GAA GGT TAT TTG GGT TCT ACT GTT AAA GAT GCT GTT GTT ACT GTT  432
Ala Glu Gly Tyr Leu Gly Ser Thr Val Lys Asp Ala Val Val Thr Val
    130                 135                 140

CCA GCT TAT TTC AAT GAT TCT CAA AGA CAA GCC ACC AAA GAT GCT GGT  480
Pro Ala Tyr Phe Asn Asp Ser Gln Arg Gln Ala Thr Lys Asp Ala Gly
145                 150                 155                 160

ACT ATT GCT GGT TTG AAT GTT TTA AGA ATT ATT AAT GAA CCT ACT GCT  528
Thr Ile Ala Gly Leu Asn Val Leu Arg Ile Ile Asn Glu Pro Thr Ala
                165                 170                 175

GCT GCC ATT GCT TAT GGT TTA GAT AAA AAA GGT TCC AGA GGT GAA CAT  576
Ala Ala Ile Ala Tyr Gly Leu Asp Lys Lys Gly Ser Arg Gly Glu His
            180                 185                 190

AAT GTT TTA ATT TTC GAT TTG GGT GGT GGT ACT TTT GAT GTT TCA TTA  624
Asn Val Leu Ile Phe Asp Leu Gly Gly Gly Thr Phe Asp Val Ser Leu
        195                 200                 205

TTA GCC ATT GAT GAA GGT ATT TTC GAA GTT AAA GCC ACT GCT GGT GAT  672
Leu Ala Ile Asp Glu Gly Ile Phe Glu Val Lys Ala Thr Ala Gly Asp
    210                 215                 220

ACT CAT TTG GGT GGT GAA GAT TTT GAT AAC AGA TTA GTC AAC TTC TTT  720
Thr His Leu Gly Gly Glu Asp Phe Asp Asn Arg Leu Val Asn Phe Phe
225                 230                 235                 240

ATT CAA GAA TTC AAG AGA AAG AAC AAG AAA GAT ATT TCC ACC AAC CAA  768
Ile Gln Glu Phe Lys Arg Lys Asn Lys Lys Asp Ile Ser Thr Asn Gln
                245                 250                 255

AGA GCT TTA AGA AGA TTA AGA ACT GCT TGT GAA AGA GCC AAG AGA ACT  816
Arg Ala Leu Arg Arg Leu Arg Thr Ala Cys Glu Arg Ala Lys Arg Thr
            260                 265                 270

TTG TCT TCT TCT GCT CAA ACC TCA ATT GAA ATT GAT TCC TTA TAT GAA  864
Leu Ser Ser Ser Ala Gln Thr Ser Ile Glu Ile Asp Ser Leu Tyr Glu
        275                 280                 285

GGT ATT GAC TTC TAC ACT TCA ATC ACC AGA GCC AGA TTT GAA GAA TTG  912
Gly Ile Asp Phe Tyr Thr Ser Ile Thr Arg Ala Arg Phe Glu Glu Leu
    290                 295                 300

TGT GCT GAC TTG TTT AGA TCC ACT TTA GAT CCA GTT GGT AAA GTT TTA  960
Cys Ala Asp Leu Phe Arg Ser Thr Leu Asp Pro Val Gly Lys Val Leu
305                 310                 315                 320

GCT GAT GCC AAG ATT GAT AAA TCT CAA GTT GAA GAA ATT GTC TTG GTT  1008
Ala Asp Ala Lys Ile Asp Lys Ser Gln Val Glu Glu Ile Val Leu Val
                325                 330                 335

GGT GGG TCC ACC AGA ATT CCA AAG ATT CAA AAA TTG GTT TCT GAT TTC  1056
Gly Gly Ser Thr Arg Ile Pro Lys Ile Gln Lys Leu Val Ser Asp Phe
            340                 345                 350

TTT AAT GGT AAA GAA TTG AAT AAA TCT ATC AAC CCT GAT GAA GCT GTT  1104
Phe Asn Gly Lys Glu Leu Asn Lys Ser Ile Asn Pro Asp Glu Ala Val
        355                 360                 365

GCT TAT GGT GCT GCT GTT CAA GCT GCC ATT TTA ACT GGT GAT ACT TCT  1152
Ala Tyr Gly Ala Ala Val Gln Ala Ala Ile Leu Thr Gly Asp Thr Ser
    370                 375                 380

TCC AAG ACT CAA GAT ATT TTG TTA TTG GAT GTT GCT CCA TTG TCA TTA  1200
Ser Lys Thr Gln Asp Ile Leu Leu Leu Asp Val Ala Pro Leu Ser Leu
385                 390                 395                 400

GGT ATT GAA ACT GCT GGT GGT ATC ATG ACC AAA TTG ATT CCA AGA AAT  1248
Gly Ile Glu Thr Ala Gly Gly Ile Met Thr Lys Leu Ile Pro Arg Asn
                405                 410                 415

TCT ACT ATT CCA ACT AAG AAA TCA GAA ACT TTC TCC ACT TAT GCC GAT  1296
Ser Thr Ile Pro Thr Lys Lys Ser Glu Thr Phe Ser Thr Tyr Ala Asp
            420                 425                 430

AAC CAA CCA GGT GTT TTG ATT CAA GTG TTT GAA GGT GAA AGA GCT AAA  1344
Asn Gln Pro Gly Val Leu Ile Gln Val Phe Glu Gly Glu Arg Ala Lys
        435                 440                 445

ACT AAA GAT AAC AAC TTG TTG GGT AAA TTT GAA TTA TCT GGT ATT CCA  1392
Thr Lys Asp Asn Asn Leu Leu Gly Lys Phe Glu Leu Ser Gly Ile Pro
    450                 455                 460

CCA GCT CCA AGA GGC GTC CCT CAA ATT GAA GTT ACT TTC GAT ATT GAT  1440
Pro Ala Pro Arg Gly Val Pro Gln Ile Glu Val Thr Phe Asp Ile Asp
465                 470                 475                 480

GCT AAT GGT ATC TTG AAT GTT TCT GCT TTA GAA AAA GGT ACT GGT AAA  1488
Ala Asn Gly Ile Leu Asn Val Ser Ala Leu Glu Lys Gly Thr Gly Lys
                485                 490                 495

ACT CAA AAG ATT ACT ATC ACC AAC GAT AAA GGT AGA TTA TCC AAA GAA  1536
Thr Gln Lys Ile Thr Ile Thr Asn Asp Lys Gly Arg Leu Ser Lys Glu
            500                 505                 510


GAA ATT GAT AAA ATG GTT AGT GAA GCT GAA AAA TTC AAA GAA GAA GAT  1584
Glu Ile Asp Lys Met Val Ser Glu Ala Glu Lys Phe Lys Glu Glu Asp
        515                 520                 525

GAA AAG GAA GCT GCT AGA GTC CAA GCC AAG AAT CAA TTG GAA TCT TAT  1632
Glu Lys Glu Ala Ala Arg Val Gln Ala Lys Asn Gln Leu Glu Ser Tyr
    530                 535                 540

GCT TAT TCA TTG AAA AAC ACA ATC AAT GAT GGT GAA ATG AAA GAT AAG  1680
Ala Tyr Ser Leu Lys Asn Thr Ile Asn Asp Gly Glu Met Lys Asp Lys
545                 550                 555                 560

ATT GGT GCA GAT GAT AAA GAA AAA TTA ACT AAA GCC ATT GAT GAA ACT  1728
Ile Gly Ala Asp Asp Lys Glu Lys Leu Thr Lys Ala Ile Asp Glu Thr
                565                 570                 575

ATT TCT TGG TTA GAT GCA TCT CAA GCT GCT TCT ACT GAA GAA TAC GAA  1776
Ile Ser Trp Leu Asp Ala Ser Gln Ala Ala Ser Thr Glu Glu Tyr Glu
            580                 585                 590

GAT AAA CGT AAA GAA TTA GAA TCA GTT GCT AAT CCA ATC ATT AGT GGT  1824
Asp Lys Arg Lys Glu Leu Glu Ser Val Ala Asn Pro Ile Ile Ser Gly
        595                 600                 605

GCT TAT GGT GCT GCC GGT GGC GCT CCA GGT GGT GCA GGC GGA TTC CCA  1872
Ala Tyr Gly Ala Ala Gly Gly Ala Pro Gly Gly Ala Gly Gly Phe Pro
    610                 615                 620

GGT GCT GGT GGC TTC CCA GGT GGT GCC CCA GGT GCC GGT GGT CCA GGT  1920
Gly Ala Gly Gly Phe Pro Gly Gly Ala Pro Gly Ala Gly Gly Pro Gly
625                 630                 635                 640

GGT GCT ACT GGT GGT GAA TCT AGT GGA CCA ACT GTT GAA GAA GTT GAT  1968
Gly Ala Thr Gly Gly Glu Ser Ser Gly Pro Thr Val Glu Glu Val Asp
                645                 650                 655

TAAATGAGGAAGAAATAGGTAATTTACTGTGG  2000


(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 656 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Lys Ala Val Gly Ile Asp Leu Gly Thr Thr Tyr Ser Cys Val
1               5                   10                  15

Ala His Phe Ala Asn Asp Arg Val Glu Ile Ile Ala Asn Asp Gln Gly
            20                  25                  30

Asn Arg Thr Thr Pro Ser Phe Val Ala Phe Thr Asp Thr Glu Arg Leu
        35                  40                  45

Ile Gly Asp Ala Ala Lys Asn Gln Ala Ala Met Asn Pro Ala Asn Thr
    50                  55                  60

Val Phe Asp Ala Lys Arg Leu Ile Gly Arg Lys Phe Asp Asp Pro Glu
65                  70                  75                  80

Val Ile Asn Asp Ala Lys His Phe Pro Phe Lys Val Ile Asp Lys Ala
                85                  90                  95

Gly Lys Pro Val Ile Gln Val Glu Tyr Lys Gly Glu Thr Lys Thr Phe
            100                 105                 110

Ser Pro Glu Glu Ile Ser Ser Met Val Leu Thr Lys Met Lys Glu Ile
        115                 120                 125

Ala Glu Gly Tyr Leu Gly Ser Thr Val Lys Asp Ala Val Val Thr Val
    130                 135                 140

Pro Ala Tyr Phe Asn Asp Ser Gln Arg Gln Ala Thr Lys Asp Ala Gly
145                 150                 155                 160

Thr Ile Ala Gly Leu Asn Val Leu Arg Ile Ile Asn Glu Pro Thr Ala
                165                 170                 175

Ala Ala Ile Ala Tyr Gly Leu Asp Lys Lys Gly Ser Arg Gly Glu His
            180                 185                 190

Asn Val Leu Ile Phe Asp Leu Gly Gly Gly Thr Phe Asp Val Ser Leu
        195                 200                 205

Leu Ala Ile Asp Glu Gly Ile Phe Glu Val Lys Ala Thr Ala Gly Asp
    210                 215                 220

Thr His Leu Gly Gly Glu Asp Phe Asp Asn Arg Leu Val Asn Phe Phe
225                 230                 235                 240

Ile Gln Glu Phe Lys Arg Lys Asn Lys Lys Asp Ile Ser Thr Asn Gln
                245                 250                 255

Arg Ala Leu Arg Arg Leu Arg Thr Ala Cys Glu Arg Ala Lys Arg Thr
            260                 265                 270

Leu Ser Ser Ser Ala Gln Thr Ser Ile Glu Ile Asp Ser Leu Tyr Glu
        275                 280                 285

Gly Ile Asp Phe Tyr Thr Ser Ile Thr Arg Ala Arg Phe Glu Glu Leu
    290                 295                 300

Cys Ala Asp Leu Phe Arg Ser Thr Leu Asp Pro Val Gly Lys Val Leu
305                 310                 315                 320

Ala Asp Ala Lys Ile Asp Lys Ser Gln Val Glu Glu Ile Val Leu Val
                325                 330                 335

Gly Gly Ser Thr Arg Ile Pro Lys Ile Gln Lys Leu Val Ser Asp Phe
            340                 345                 350

Phe Asn Gly Lys Glu Leu Asn Lys Ser Ile Asn Pro Asp Glu Ala Val
        355                 360                 365

Ala Tyr Gly Ala Ala Val Gln Ala Ala Ile Leu Thr Gly Asp Thr Ser
    370                 375                 380

Ser Lys Thr Gln Asp Ile Leu Leu Leu Asp Val Ala Pro Leu Ser Leu
385                 390                 395                 400

Gly Ile Glu Thr Ala Gly Gly Ile Met Thr Lys Leu Ile Pro Arg Asn
                405                 410                 415

Ser Thr Ile Pro Thr Lys Lys Ser Glu Thr Phe Ser Thr Tyr Ala Asp
            420                 425                 430

Asn Gln Pro Gly Val Leu Ile Gln Val Phe Glu Gly Glu Arg Ala Lys
        435                 440                 445

Thr Lys Asp Asn Asn Leu Leu Gly Lys Phe Glu Leu Ser Gly Ile Pro
    450                 455                 460

Pro Ala Pro Arg Gly Val Pro Gln Ile Glu Val Thr Phe Asp Ile Asp
465                 470                 475                 480

Ala Asn Gly Ile Leu Asn Val Ser Ala Leu Glu Lys Gly Thr Gly Lys
                485                 490                 495

Thr Gln Lys Ile Thr Ile Thr Asn Asp Lys Gly Arg Leu Ser Lys Glu
            500                 505                 510

Glu Ile Asp Lys Met Val Ser Glu Ala Glu Lys Phe Lys Glu Glu Asp
        515                 520                 525

Glu Lys Glu Ala Ala Arg Val Gln Ala Lys Asn Gln Leu Glu Ser Tyr
    530                 535                 540

Ala Tyr Ser Leu Lys Asn Thr Ile Asn Asp Gly Glu Met Lys Asp Lys
545                 550                 555                 560

Ile Gly Ala Asp Asp Lys Glu Lys Leu Thr Lys Ala Ile Asp Glu Thr
                565                 570                 575

Ile Ser Trp Leu Asp Ala Ser Gln Ala Ala Ser Thr Glu Glu Tyr Glu
            580                 585                 590

Asp Lys Arg Lys Glu Leu Glu Ser Val Ala Asn Pro Ile Ile Ser Gly
        595                 600                 605

Ala Tyr Gly Ala Ala Gly Gly Ala Pro Gly Gly Ala Gly Gly Phe Pro
    610                 615                 620

Gly Ala Gly Gly Phe Pro Gly Gly Ala Pro Gly Ala Gly Gly Pro Gly
625                 630                 635                 640

Gly Ala Thr Gly Gly Glu Ser Ser Gly Pro Thr Val Glu Glu Val Asp
                645                 650                 655
Claims

1. A nucleic acid comprising a nucleotide sequence
coding for the protein having the amino acid sequence of
SEQ ID No. 2 or parts thereof.

2. A nucleic acid comprising a nucleotide sequence
with at least 65% homology with the nucleotide sequence
of SEQ ID No. 1 or parts thereof.

3. A nucleic acid according to claim 2 comprising
the nucleotide sequence of SEQ ID No. 1 or parts thereof.

4. Composition comprising a nucleic acid according
to any of claims 1 to 3.

5. Use of the nucleic acid according to any of
claims 1 to 3 for oligonucleotide probes to be used in
diagnosis and typing of Candida and Candida related
pathologies.

6. Oligonucleotide having a sequence comprised in SEQ
ID No. 1 to be used for PCR (polymerase chain reaction)
to detect the presence in biological and/or environmental
samples either of C. albicans or of other Candida species
or of related yeast-like microorganisms comprising said
gene, and/or in a labeled form (radionuclides, biotin,
enzymes, etc.) to detect the presence in biological
and/or environmental samples either of C. albicans or of
other related species, and/or for the typing and/or
diagnosis of C. albicans or related species, and/or as
potential antibiotic and/or chemotherapeutic targets, or
as antisense RNA active against Candida species and/or
related yeast-like microorganisms coding for a homologous
sequence.

7. Polypeptide having the amino acid sequence
comprised in SEQ ID No. 1, or having at least 50%
homology with SEQ ID No. 1, or fragments and/or
functional and immunological homologues thereof.